WikiKreator: Improving Wikipedia Stubs Automatically
Abstract
Stubs on Wikipedia often lack comprehensive information. The high cost of editing Wikipedia and the limited number of active contributors curb its consistent growth. In this work, we present WikiKreator, a system that is capable of generating content automatically to improve existing stubs on Wikipedia. The system has two components. First, a text classifier built using topic distribution vectors is used to assign content from the web to various sections of a Wikipedia article. Second, we propose a novel abstractive summarization technique based on an optimization framework that generates section-specific summaries for Wikipedia stubs. Experiments show that WikiKreator is capable of generating well-formed, informative content. Further, automatically generated content from our system has been appended to Wikipedia stubs, and the content has been retained successfully, demonstrating the effectiveness of our approach.
Similar resources
Generating Natural Language from Linked Data: Unsupervised template extraction
We propose an architecture for generating natural language from Linked Data that automatically learns sentence templates and statistical document planning from parallel RDF datasets and text. We have built a proof-of-concept system (LOD-DEF) trained on un-annotated text from the Simple English Wikipedia and RDF triples from DBpedia, focusing exclusively on factual, non-temporal information. The...
Wikipedia Neuroscience Stub Editing in an Introductory Undergraduate Neuroscience Course
In response to the Society for Neuroscience initiative to help improve the neuroscience related content in Wikipedia, I implemented Wikipedia article construction and revision in my Introduction to Neuroscience course at Boston College as a writing intensive and neuroscience related outreach activity. My students worked in small groups to revise neuroscience "stubs" of their choice, many of whi...
Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis
Community-based knowledge forums, such as Wikipedia, are susceptible to vandalism, i.e., ill-intentioned contributions that are detrimental to the quality of collective intelligence. Most previous work to date relies on shallow lexico-syntactic patterns and metadata to automatically detect vandalism in Wikipedia. In this paper, we explore more linguistically motivated approaches to vandalism de...
Uncover What You See in Your Images: The InfoAlbum approach
This paper presents InfoAlbum, a novel prototype for image centric information collection, where the goal is to automatically provide the user with information about i) the object or event depicted in an image, and ii) the location where the image was taken. The system aims at improving the image viewing experience by presenting supplementary information such as location names, tags, weather co...
Towards linking libraries and Wikipedia: automatic subject indexing of library records with Wikipedia concepts
In this article, we first argue the importance and timely need of linking libraries and Wikipedia for improving the quality of their services to information consumers, as such linkage will enrich the quality of Wikipedia articles and at the same time increase the visibility of library resources which are currently overlooked to a large degree. We then describe the development of an automatic sy...